UMND2: SenseClusters Applied to the Sense Induction Task of Senseval-4
Abstract
SenseClusters is a freely available open-source system that served as the University of Minnesota, Duluth entry in the SENSEVAL-4 sense induction task. For this task, SenseClusters was configured to represent each instance to be clustered as the centroid of the word co-occurrence vectors that replace the words in that instance. The instances are then clustered using k-means, with the number of clusters discovered automatically by the Adapted Gap Statistic. In these experiments SenseClusters used no information beyond the raw untagged text to be clustered, and the system was not tuned on external corpora.
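The representation described in the abstract can be illustrated with a minimal Python sketch. This is not SenseClusters code and does not reproduce its configuration: the function names (`build_word_vectors`, `context_centroid`, `kmeans`), the window size, and the toy contexts are all hypothetical, and the number of clusters is fixed by hand here rather than discovered with the Adapted Gap Statistic.

```python
import random


def build_word_vectors(contexts, window=2):
    """Give every word a vector counting the words seen within `window` positions of it."""
    vocab = sorted({w for toks in contexts for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for toks in contexts:
        for i, w in enumerate(toks):
            lo, hi = max(0, i - window), min(len(toks), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[w][index[toks[j]]] += 1.0
    return vectors


def context_centroid(toks, vectors):
    """Replace each word by its co-occurrence vector and average them."""
    dim = len(next(iter(vectors.values())))
    rows = [vectors[w] for w in toks if w in vectors]
    if not rows:
        return [0.0] * dim
    return [sum(col) / len(rows) for col in zip(*rows)]


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means with a fixed k (assumption: k is chosen by hand, not by a gap statistic)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy instances of the target word "bank" (hypothetical data).
contexts = [
    "deposit money bank account".split(),
    "bank loan money interest".split(),
    "river bank water fishing".split(),
    "muddy river bank water".split(),
]
vectors = build_word_vectors(contexts)
points = [context_centroid(toks, vectors) for toks in contexts]
labels = kmeans(points, k=2)
print(labels)
```

Each instance ends up as one dense vector over the vocabulary, so instances that share few words directly can still be close if their words co-occur with similar neighbours. A real run would draw the co-occurrence counts from a much larger corpus and choose k automatically.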
Similar resources
Duluth: Word Sense Induction Applied to Web Page Clustering
The Duluth systems that participated in task 11 of SemEval-2013 carried out word sense induction (WSI) in order to cluster Web search results. They relied on an approach that represented Web snippets using second-order co-occurrences. These systems were all implemented using SenseClusters, a freely available open source software package.
SenseClusters - Finding Clusters that Represent Word Senses
SenseClusters is a freely available word sense discrimination system that takes a purely unsupervised clustering approach. It uses no knowledge other than what is available in a raw unstructured corpus, and clusters instances of a given target word based only on their mutual contextual similarities. It is a complete system that provides support for feature selection from large corpora, several ...
Regularized Least-Squares classification for Word Sense Disambiguation
The paper describes the RLSC-LIN and RLSC-COMB systems, which participated in the Senseval-3 English lexical sample task. These systems are based on the Regularized Least-Squares Classification (RLSC) learning method. We describe the reasons for choosing this method, how we applied it to word sense disambiguation, what results we obtained on Senseval-1, Senseval-2 and Senseval-3 data and discuss some possi...
Structural semantic interconnection: a knowledge-based approach to Word Sense Disambiguation
In this paper we describe the SSI algorithm, a structural pattern matching algorithm for WSD. The algorithm has been applied to the gloss disambiguation task of Senseval-3.
Identifying Similar Words and Contexts in Natural Language with SenseClusters
SenseClusters is a freely available intelligent system that clusters together similar contexts in natural language text. Thereafter it assigns identifying labels to these clusters based on their content. It is a purely unsupervised approach that is language independent, and uses no knowledge other than what is available in raw un-annotated corpora. In addition to clustering similar contexts, it...
Publication date: 2007